Replica consistency in a Data Grid
Abstract
A Data Grid is a wide area computing infrastructure that employs Grid technologies to provide storage capacity and processing power to applications that handle very large quantities of data. Data Grids rely on data replication to achieve better performance and reliability by storing copies of data sets on different Grid nodes. When a data set can be modified by applications, the problem of maintaining consistency among existing copies arises. The consistency problem also concerns metadata, i.e., additional information about application data sets such as indices, directories, or catalogues. This kind of metadata is used both by the applications and by the Grid middleware to manage the data. For instance, the Replica Management Service (the Grid middleware component that controls data replication) uses catalogues to find the replicas of each data set. Such catalogues can also be replicated and their consistency is crucial to the correct operation of the Grid. Therefore, metadata consistency generally poses stricter requirements than data consistency. In this paper we report on the development of a Replica Consistency Service based on the middleware mainly developed by the European Data Grid Project. The paper summarises the main issues in the replica consistency problem, and lays out a high-level architectural design for a Replica Consistency Service. Finally, results from simulations of different consistency models are presented. © 2004 Elsevier B.V. All rights reserved.
Similar resources
Increasing performance in Data grid by a new replica replacement algorithm
Data Grid provides sharing services for very large data around the world. Data replication is one of the most effective approaches to reduce access latency and response time. In addition to the benefits, replication has costs such as storage and bandwidth consumption, especially when storage space is low and limited. Therefore, the data replacement should be done wisely. In this p...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
The data grid technology, which uses the scale of the Internet to solve storage limitation for the huge amount of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environment to copy frequently accessed data in suitable sites. The primary purposes are shortening distance of file transmission and achieving files from ...
Increasing Replica Consistency Performances with Load Balancing Strategy in Data Grid Systems
Data replication in data grid systems is one of the important solutions that improve availability, scalability, and fault tolerance. However, this technique can also bring some involved issues such as maintaining replica consistency. Moreover, as grid environment are very dynamic some nodes can be more uploaded than the others to become eventually a bottleneck. The main idea of our work is to p...
Models for Replica Synchronisation and Consistency in a Data Grid
Data Grids are currently proposed solutions to large scale data management problems including efficient file transfer and replication. Large amounts of data and the world-wide distribution of data stores contribute to the complexity of the data management challenge. Recent architecture proposals and prototypes deal with replication of read-only files but do not address the replica synchronisati...
A Load Balancing Strategy for Replica Consistency Maintenance in Data Grid Systems
In data grid environment, the management of shared data is one of the major scientific challenges. Data replication is one of the important techniques used in grid systems to increase the availability, scalability and fault tolerance. However, the update of a replica might bring a critical problem of replica consistency maintenance. Thus, maintaining the consistency of the replicas is not trivi...